Combining Heterogeneous Knowledge Resources for Improved Distributional Semantic Models

Authors

  • György Szarvas
  • Torsten Zesch
  • Iryna Gurevych
Abstract

The Explicit Semantic Analysis (ESA) model, based on term co-occurrences in Wikipedia, has been regarded as a state-of-the-art semantic relatedness measure in recent years. We analyze the important parameters of ESA using datasets in five different languages. Additionally, we propose using ESA with multiple lexical semantic resources, thereby exploiting multiple sources of evidence of term co-occurrence to improve over the Wikipedia-based measure. Exploiting the improved robustness and coverage of the proposed combination, we report improved performance over single resources in word semantic relatedness, solving word choice problems, classification of semantic relations between nominals, and text similarity.
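To make the idea in the abstract concrete, here is a minimal sketch of an ESA-style relatedness measure over more than one resource. It is an illustration under stated assumptions, not the authors' implementation: the toy "concept" texts, the tf-idf weighting, the function names (`build_index`, `relatedness`), and the choice to combine resources by concatenating their concept spaces are all assumptions made for this example, since the abstract does not specify these details.

```python
import math
from collections import Counter


def build_index(concepts):
    """Map each term to a sparse vector of tf-idf weights over concepts.

    `concepts` maps a concept name (e.g. an article title) to its text.
    The tf-idf scheme used here is an assumption for illustration.
    """
    n = len(concepts)
    tf = {name: Counter(text.lower().split()) for name, text in concepts.items()}
    df = Counter(term for counts in tf.values() for term in counts)
    index = {}
    for name, counts in tf.items():
        for term, freq in counts.items():
            weight = freq * math.log(n / df[term])
            if weight > 0:  # terms occurring in every concept carry no signal
                index.setdefault(term, {})[name] = weight
    return index


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v[c] for c, w in u.items() if c in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def relatedness(a, b, indexes):
    """Compare two terms in the concatenated concept space of all resources.

    Concatenation is an assumed combination strategy for this sketch.
    """
    u, v = {}, {}
    for rid, index in enumerate(indexes):
        for c, w in index.get(a, {}).items():
            u[(rid, c)] = w
        for c, w in index.get(b, {}).items():
            v[(rid, c)] = w
    return cosine(u, v)


# Hypothetical toy resources: an encyclopedia and a dictionary.
wiki = {
    "Cat": "cat feline pet animal",
    "Dog": "dog canine pet animal",
    "Car": "car vehicle engine road",
}
dictionary = {
    "kitten": "kitten young cat",
    "puppy": "puppy young dog",
    "truck": "truck large vehicle",
}
wiki_index = build_index(wiki)
dict_index = build_index(dictionary)

single = relatedness("kitten", "cat", [wiki_index])
combined = relatedness("kitten", "cat", [wiki_index, dict_index])
print(single, combined)
```

In this toy setup the word "kitten" does not occur in the encyclopedia-like resource at all, so the single-resource score is zero, while the combined concept space yields a positive score. That is one simple way to see how adding resources can improve coverage.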

Similar resources

A Framework for Enriching Lexical Semantic Resources with Distributional Semantics

We present an approach to combining distributional semantic representations induced from text corpora with manually constructed lexical-semantic networks. While both kinds of semantic resources are available with high lexical coverage, our aligned resource combines the domain specificity and availability of contextual information from distributional models with the conciseness and high quality ...

Towards a Distributional Semantic Web Stack

The capacity of distributional semantic models (DSMs) to discover similarities over large-scale, heterogeneous, and poorly structured data makes them a promising universal and low-effort framework to support semantic approximation and knowledge discovery. This position paper explores the role of distributional semantics in the Semantic Web vision, based on state-of-the-art distributional-rela...

Building Semantic Networks from Plain Text and Wikipedia with Application to Semantic Relatedness and Noun Compound Paraphrasing

The construction of suitable and scalable representations of semantic knowledge is a core challenge in Semantic Computing. Manually created resources such as WordNet have been shown to be useful for many AI and NLP tasks, but they are inherently restricted in their coverage and scalability. In addition, they have been challenged by simple distributional models on very large corpora, questioning...

Synonym extraction and abbreviation expansion with ensembles of semantic spaces

BACKGROUND Terminologies that account for variation in language use by linking synonyms and abbreviations to their corresponding concept are important enablers of high-quality information extraction from medical texts. Due to the use of specialized sub-languages in the medical domain, manual construction of semantic resources that accurately reflect language use is both costly and challenging, ...

Towards an Approximative Ontology-Agnostic Approach for Logic Programs

Distributional semantics focuses on the automatic construction of a semantic model based on the statistical distribution of colocated words in large-scale texts. Deductive reasoning is a fundamental component for semantic understanding. Despite the generality and expressivity of logical models, from an applied perspective, deductive reasoners are dependent on highly consistent conceptual models...

Publication date: 2011